
    Sparsifying to optimize over multiple information sources: an augmented Gaussian process based algorithm

    Optimizing a black-box, expensive, and multi-extremal function, given multiple approximations, is a challenging task known as multi-information source optimization (MISO), where each source has a different cost and the level of approximation (aka fidelity) of each source can change over the search space. While most current approaches fuse the Gaussian processes (GPs) modelling each source, we propose to use GP sparsification to select only "reliable" function evaluations performed over all the sources. These selected evaluations are used to create an augmented Gaussian process (AGP), so named because the evaluations on the most expensive source are augmented with the reliable evaluations from less expensive sources. A new acquisition function, based on the confidence bound, is also proposed; it accounts for both the cost of the next source to query and the location-dependent approximation quality of that source. This approximation quality is estimated through a model discrepancy measure and the prediction uncertainty of the GPs. MISO-AGP and its MISO-fused GP counterpart are compared on two test problems and on the hyperparameter optimization of a machine learning classifier on a large dataset.
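    A minimal sketch of the augmentation idea described above, using scikit-learn GPs: evaluations from cheaper sources are kept only if they fall inside the uncertainty band of the GP fitted on the most expensive source, and a confidence-bound acquisition is penalized by source cost. The threshold `beta`, the additive cost penalty, and the synthetic data are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def augmented_gp(X_hi, y_hi, cheap_sources, beta=1.0):
    """Fit a GP on the most expensive source, then augment its training set
    with evaluations from cheaper sources that agree with its prediction."""
    gp_hi = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X_hi, y_hi)
    X_aug, y_aug = [X_hi], [y_hi]
    for X_lo, y_lo in cheap_sources:
        mu, sigma = gp_hi.predict(X_lo, return_std=True)
        reliable = np.abs(y_lo - mu) <= beta * sigma  # model-discrepancy test
        X_aug.append(X_lo[reliable])
        y_aug.append(y_lo[reliable])
    X_aug, y_aug = np.vstack(X_aug), np.concatenate(y_aug)
    return GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X_aug, y_aug)

def cost_aware_lcb(agp, X_cand, source_cost, kappa=2.0, cost_weight=0.1):
    """Lower-confidence-bound acquisition (minimization) with an additive
    penalty for the cost of querying the chosen source (illustrative form)."""
    mu, sigma = agp.predict(X_cand, return_std=True)
    return mu - kappa * sigma + cost_weight * source_cost

# Tiny usage example on synthetic 1-D data (purely illustrative).
rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x).ravel()
X_hi = rng.uniform(0, 2, size=(5, 1));  y_hi = f(X_hi)
X_lo = rng.uniform(0, 2, size=(30, 1)); y_lo = f(X_lo) + 0.3 * rng.normal(size=30)
agp = augmented_gp(X_hi, y_hi, [(X_lo, y_lo)])
X_cand = np.linspace(0, 2, 100).reshape(-1, 1)
x_next = X_cand[np.argmin(cost_aware_lcb(agp, X_cand, source_cost=1.0))]
```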

    A Hyper-Solution Framework for SVM Classification: Application for Predicting Destabilizations in Chronic Heart Failure Patients

    Support Vector Machines (SVMs) represent a powerful learning paradigm able to provide accurate and reliable decision functions in several application fields. In particular, they are attractive for the medical domain, where domain knowledge is often lacking. The kernel trick, on which SVMs are based, makes it possible to map non-linearly separable data into a space where it is potentially linearly separable, depending on the kernel function and the values of its internal parameters. In recent years, non-parametric approaches have also been proposed for learning the most appropriate kernel, such as linear combinations of basic kernels. Thus, SVM classifiers may have several parameters to be tuned, and their optimal values are usually difficult to identify a priori. Furthermore, combining different classifiers may reduce the risk of errors on new, unseen data. For these reasons, we present a hyper-solution framework for SVM classification, based on meta-heuristics, that searches for the most reliable hyper-classifier (an SVM with a basic kernel, an SVM with a combination of kernels, or an ensemble of SVMs) and for its optimal configuration. We have applied the proposed framework to a critical and quite complex issue in the management of Chronic Heart Failure patients: the early detection of decompensation conditions. Predicting new destabilizations in advance may reduce the burden of heart failure on healthcare systems while improving the quality of life of affected patients. Promising reliability has been obtained in 10-fold cross validation, showing our approach to be efficient and effective for a high-level analysis of clinical data.
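    The sketch below illustrates the "hyper-classifier" search space described above: candidates are an SVM with a basic kernel, an SVM with a convex combination of two basic kernels, or a small ensemble of SVMs, scored by 10-fold cross validation. Plain random search stands in for the paper's meta-heuristic, and the dataset, kernel choices, and parameter ranges are placeholders, not the published configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

def combined_kernel(gamma, degree, alpha):
    """Convex combination of an RBF and a polynomial kernel."""
    def k(A, B):
        return alpha * rbf_kernel(A, B, gamma=gamma) + \
               (1 - alpha) * polynomial_kernel(A, B, degree=degree)
    return k

def random_candidate():
    """Draw one hyper-classifier: basic-kernel SVM, combined-kernel SVM, or ensemble."""
    choice = rng.choice(["basic", "combined", "ensemble"])
    C = 10.0 ** rng.uniform(-2, 2)
    if choice == "basic":
        return SVC(C=C, kernel=str(rng.choice(["linear", "rbf", "poly"])))
    if choice == "combined":
        return SVC(C=C, kernel=combined_kernel(gamma=10.0 ** rng.uniform(-3, 0),
                                               degree=int(rng.integers(2, 4)),
                                               alpha=rng.uniform(0, 1)))
    return VotingClassifier([("rbf", SVC(C=C, kernel="rbf")),
                             ("lin", SVC(C=C, kernel="linear")),
                             ("poly", SVC(C=C, kernel="poly"))])

best_score, best_model = -np.inf, None
for _ in range(20):                                       # search budget (placeholder)
    model = random_candidate()
    score = cross_val_score(model, X, y, cv=10).mean()    # 10-fold CV as in the paper
    if score > best_score:
        best_score, best_model = score, model
print(best_score, best_model)
```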

    Improving analytics in urban water management: a spectral clustering-based approach for leakage localization

    Worldwide growth in water demand has been forcing utilities to manage their costs carefully. In an era of tight budgets across most economic and social sectors, this pressure also affects Water Distribution Networks (WDNs). Efficient urban water management is therefore needed to balance consumer satisfaction against the infrastructural assets of the WDN. A particular case is that of pipe networks, which suffer from frequent leaks, failures and service disruptions. The ensuing costs of inspection, repair and replacement are a significant part of operational expenses and give rise to difficult decision making. Recently, the goal of improving the traditional leakage management process through analytical leakage localization tools has come to the forefront, leading to several proposed approaches. All of these methods rely on the fact that leaks can be detected by correlating changes in flow with the output of a simulation model whose parameters are related to both the location and the severity of the leak. This paper, starting from a previous work of the authors, shows how the critical phases of leak localization can be accomplished through a combination of hydraulic simulation and clustering. The research investigates the benefits provided by Spectral Clustering, which is usually adopted for network analysis tasks (e.g., community or sub-network discovery). A transformation from a dataset of data points, consisting of leakage scenarios simulated through a hydraulic simulation model, to a similarity graph is presented. Spectral Clustering is then applied to the similarity graph, and the results are compared with those provided by traditional clustering techniques on the original dataset. The proposed spectral approach proved more effective than traditional clustering, localizing leaks in a water distribution network more accurately and, consequently, reducing the costs of intervention, inspection and rehabilitation.
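    A minimal sketch of the clustering step, assuming each simulated leakage scenario has already been reduced to a feature vector (e.g., flow or pressure residuals at monitored nodes). The hydraulic simulation itself is out of scope here, so synthetic blob data stands in for the scenario dataset; the RBF affinity, `gamma`, and the silhouette comparison are illustrative choices, not necessarily the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import rbf_kernel

# Placeholder: 200 simulated leakage scenarios described by 12 sensor features.
scenarios, _ = make_blobs(n_samples=200, centers=5, n_features=12, random_state=0)

# Similarity graph: dense RBF affinity matrix between scenarios.
affinity = rbf_kernel(scenarios, gamma=0.5)

spectral = SpectralClustering(n_clusters=5, affinity="precomputed",
                              assign_labels="kmeans", random_state=0)
labels_spec = spectral.fit_predict(affinity)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels_km = kmeans.fit_predict(scenarios)

# Compare cluster quality on the original feature space (one possible proxy
# for how well clusters isolate candidate leak locations).
print("spectral:", silhouette_score(scenarios, labels_spec))
print("k-means :", silhouette_score(scenarios, labels_km))
```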

    A Wasserstein distance based multiobjective evolutionary algorithm for the risk aware optimization of sensor placement

    In this paper we propose a new algorithm for the identification of optimal "sensing spots" within a network for monitoring the spread of "effects" triggered by "events". This problem is referred to as "Optimal Sensor Placement", and many real-world problems fit into this general framework. Here, sensor placement (SP), i.e., the location of sensors at some nodes, for the early detection of contaminants in water distribution networks (WDNs) is used as a running example. Usually we have to manage a trade-off between different objective functions, so we are faced with a multi-objective optimization problem (MOP). The best trade-off between the objectives can be defined in terms of Pareto optimality. We model the sensor placement problem as a multi-objective optimization problem with Boolean decision variables and propose a Multi-Objective Evolutionary Algorithm (MOEA) for approximating and analyzing the Pareto set. The evaluation of the objective functions requires the execution of a simulation model: to organize the simulation results in a computationally efficient way, we propose a data structure that collects the simulation outcomes for every SP and is particularly suitable for visualizing the dynamics of contaminant concentration and for evolutionary optimization. This data structure enables the definition of information spaces, in which a candidate placement can be represented as a matrix or, in probabilistic terms, as a histogram. The introduction of a distance between histograms, namely the Wasserstein (WST) distance, makes it possible to derive new genetic operators, indicators of the quality of the Pareto set, and criteria for choosing among the Pareto solutions. The new algorithm, MOEA/WST, has been tested on two benchmark water distribution networks and a real-world network. Preliminary results, compared with NSGA-II, show better performance in terms of hypervolume and coverage, in particular for relatively large networks and low generation counts.
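    As a sketch of the core comparison used above, the snippet below computes the Wasserstein distance between two candidate placements, each summarized by the empirical distribution of detection times over simulated contamination events. The gamma-distributed detection times are placeholders for simulation output, and only the distance itself is shown; the MOEA/WST genetic operators and indicators are not reproduced.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Placeholder detection times (in minutes) for two placements over the same
# set of simulated contamination events; shorter times are better.
detection_times_a = rng.gamma(shape=2.0, scale=30.0, size=500)
detection_times_b = rng.gamma(shape=3.0, scale=30.0, size=500)

# 1-D Wasserstein distance between the two empirical distributions: the kind
# of histogram distance the abstract uses to define genetic operators and
# quality indicators over the information space.
d = wasserstein_distance(detection_times_a, detection_times_b)
print(f"WST distance between placements: {d:.2f} minutes")
```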

    Efficient Kernel-Based Subsequence Search for Enabling Health Monitoring Services in IoT-Based Home Setting

    This paper presents an efficient approach for subsequence search in data streams. The problem consists of identifying coherent repetitions of a given reference time-series, also in the multivariate case, within a longer data stream. The most widely adopted metric for this problem is Dynamic Time Warping (DTW), but its computational complexity is a well-known issue. In this paper, we present an approach aimed at learning a kernel approximating DTW for efficiently analyzing streaming data collected from wearable sensors, while reducing the burden of DTW computation. Unlike a kernel, DTW can compare two time-series of different lengths. To enable the use of a kernel for comparing two time-series of different lengths, a feature embedding is required in order to obtain a fixed-length vector representation: each vector component is the DTW distance between the given time-series and one of a set of randomly chosen "basis" series. The approach has been validated on two benchmark datasets and on a real-life application supporting self-rehabilitation in elderly subjects. A comparison with traditional DTW implementations and other state-of-the-art algorithms is provided: the results show a slight decrease in accuracy, which is counterbalanced by a significant reduction in computational cost.
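    A sketch of the embedding idea: each (possibly variable-length) series is mapped to a fixed-length vector of DTW distances to a set of randomly chosen "basis" series, so that a standard kernel method (here, an RBF-kernel SVM from scikit-learn) can be applied. The plain O(nm) DTW, the synthetic sine/ramp data, and the basis size of 8 are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.svm import SVC

def dtw(a, b):
    """Classic DTW distance between two 1-D series of possibly different length."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def embed(series_list, basis):
    """Fixed-length representation: DTW distance to each basis series."""
    return np.array([[dtw(s, b) for b in basis] for s in series_list])

# Placeholder data: noisy sine vs. noisy ramp segments of varying length.
rng = np.random.default_rng(0)
def make(kind, n):
    t = np.linspace(0, 1, n)
    base = np.sin(2 * np.pi * t) if kind else t
    return base + 0.1 * rng.normal(size=n)

series = [make(k, int(rng.integers(40, 80))) for k in ([1] * 30 + [0] * 30)]
labels = np.array([1] * 30 + [0] * 30)

basis = [series[i] for i in rng.choice(len(series), size=8, replace=False)]
X = embed(series, basis)
clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
print("train accuracy:", clf.score(X, labels))
```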